17 research outputs found

    Multi-task Layout Analysis of Handwritten Musical Scores

    [EN] Document Layout Analysis (DLA) is a process that must be performed before attempting to recognize the content of handwritten musical scores by a modern automatic or semiautomatic system. DLA should provide the segmentation of the document image into semantically useful region types such as staff, lyrics, etc. In this paper we extend our previous work on DLA of handwritten text documents to also address complex handwritten music scores. The system is able to perform region segmentation, region classification and baseline detection in an integrated manner. Several experiments were performed on two different datasets in order to validate this approach and assess it in different scenarios. Results show high accuracy on such complex manuscripts and very competitive computational time, which is a good indicator of the scalability of the method to very large collections.
    This work was partially supported by the Universitat Politècnica de València under grant FPI-420II/899, a 2017-2018 Digital Humanities research grant of the BBVA Foundation for the project Carabela, the History Of Medieval Europe (HOME) project (Ref.: PCI2018-093122) and through the EU project READ (Horizon-2020 program, grant Ref. 674943). NVIDIA Corporation kindly donated the Titan X GPU used for this research.
    Quirós, L.; Toselli, AH.; Vidal, E. (2019). Multi-task Layout Analysis of Handwritten Musical Scores. Springer. 123-134. https://doi.org/10.1007/978-3-030-31321-0_11

    Two Methods to Improve Confidence Scores for Lexicon-Free Word Spotting in Handwritten Text

    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    [EN] Two methods are presented to improve word confidence scores for Line-Level Query-by-String Lexicon-Free Keyword Spotting (KWS) in handwritten text images. The first one approximates true relevance probabilities by means of computations carried out directly on character lattices obtained from the line images considered. The second method uses the same character lattices, but it obtains relevance scores by first computing frame-level character sequence scores which resemble the word posteriorgrams used in previous approaches for lexicon-based KWS. The first method results from a formal probabilistic derivation, which allows us to better understand and further develop the underlying ideas. The second one is less formal but, according to the experiments presented in the paper, it obtains almost identical results with much lower computational cost. Moreover, in contrast with the first method, the second one makes it possible to directly obtain accurate bounding boxes for the spotted words.
    This work was partially supported by the Spanish MEC under FPU grant FPU13/06281, by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMAMATER, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, grant Ref. 674943).
    Toselli, AH.; Puigcerver, J.; Vidal, E. (2016). Two Methods to Improve Confidence Scores for Lexicon-Free Word Spotting in Handwritten Text. IEEE. https://doi.org/10.1109/ICFHR.2016.0072

    Querying out-of-vocabulary words in lexicon-based keyword spotting

    The final publication is available at Springer via http://dx.doi.org/10.1007/s00521-016-2197-8
    [EN] Lexicon-based handwritten text keyword spotting (KWS) has proven to be a faster and more accurate alternative to lexicon-free methods. Nevertheless, since lexicon-based KWS relies on a predefined vocabulary, fixed in the training phase, it does not support queries involving out-of-vocabulary (OOV) keywords. In this paper, we outline previous work aimed at solving this problem and present a new approach based on smoothing the (null) scores of OOV keywords by means of the information provided by "similar" in-vocabulary words. The good results achieved using this approach are compared with previously published alternatives on different data sets.
    This work was partially supported by the Spanish MEC under FPU Grant FPU13/06281, by the Generalitat Valenciana under the Prometeo/2009/014 Project Grant ALMA-MATER, and through the EU Projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, grant Ref. 674943).
    Puigcerver, J.; Toselli, AH.; Vidal, E. (2016). Querying out-of-vocabulary words in lexicon-based keyword spotting. Neural Computing and Applications. 1-10. https://doi.org/10.1007/s00521-016-2197-8
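    The idea of smoothing the null score of an OOV keyword with scores of "similar" in-vocabulary words can be illustrated with a minimal sketch. The abstract does not give the smoothing formula, so everything below is an assumption made for illustration: similarity is measured with the Levenshtein distance, the decay parameter `alpha` and the function names `levenshtein` and `smoothed_score` are hypothetical, and the max-based combination is not claimed to be the authors' method.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def smoothed_score(query: str, line_scores: dict, alpha: float = 1.0) -> float:
    """Hypothetical smoothing (an assumption, not the paper's formula).

    line_scores maps each in-vocabulary word to its KWS score for one line.
    A query found in the mapping keeps its own score. Any other query borrows
    the best score among the known words, discounted by exp(-alpha * distance).
    """
    if query in line_scores:
        return line_scores[query]
    return max(
        (score * math.exp(-alpha * levenshtein(query, word))
         for word, score in line_scores.items()),
        default=0.0,
    )
```

    With this sketch, a query one edit away from a word scored 0.8 receives 0.8 * exp(-alpha) instead of zero, so OOV queries can still be ranked against a threshold.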

    Word graphs size impact on the performance of handwriting document applications

    [EN] Two document processing applications are considered: computer-assisted transcription of text images (CATTI) and Keyword Spotting (KWS), for transcribing and indexing handwritten documents, respectively. Instead of working directly on the handwriting images, both of them employ meta-data structures called word graphs (WG), which are obtained using segmentation-free handwritten text recognition technology based on N-gram language models and hidden Markov models. A WG contains most of the relevant information of the original text (line) image required by CATTI and KWS but, if it is too large, the computational cost of generating and using it can become unaffordable. Conversely, if it is too small, relevant information may be lost, leading to a reduction of CATTI or KWS performance. We study the trade-off between WG size and performance in terms of effectiveness and efficiency of CATTI and KWS. Results show that small, computationally cheap WGs can be used without losing the excellent CATTI and KWS performance achieved with huge WGs.
    Work partially supported by the Generalitat Valenciana under the Prometeo/2009/014 Project Grant ALMAMATER, by the Spanish MECD as part of the Valorization and I+D+I Resources program of VLC/CAMPUS in the International Excellence Campus program, and through the EU projects: HIMANIS (JPICH programme, Spanish Grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, Grant Ref. 674943).
    Toselli, AH.; Romero Gómez, V.; Vidal, E. (2017). Word graphs size impact on the performance of handwriting document applications. Neural Computing and Applications. 28(9):2477-2487. https://doi.org/10.1007/s00521-016-2336-2

    HMM word graph based keyword spotting in handwritten document images

    [EN] Line-level keyword spotting (KWS) is presented on the basis of frame-level word posterior probabilities. These posteriors are obtained using word graphs derived from the recognition process of a full-fledged handwritten text recognizer based on hidden Markov models and N-gram language models. This approach has several advantages. First, since it uses a holistic, segmentation-free technology, it does not require any kind of word or character segmentation. Second, the use of language models allows the context of each spotted word to be taken into account, thereby considerably increasing KWS accuracy. And third, the proposed KWS scores are based on true posterior probabilities, taking into account all (or most) possible word segmentations of the input image. These scores are properly bounded and normalized. This mathematically clean formulation lends itself to smooth, threshold-based keyword queries which, in turn, permit comfortable trade-offs between search precision and recall. Experiments are carried out on several historic collections of handwritten text images, as well as a well-known data set of modern English handwritten text. According to the empirical results, the proposed approach achieves KWS results comparable to those obtained with the recently-introduced "BLSTM neural networks KWS" approach and clearly outperforms the popular, state-of-the-art "Filler HMM" KWS method. Overall, the results clearly support all the above-claimed advantages of the proposed approach.
    This work has been partially supported by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant Ref. 674943).
    Toselli, AH.; Vidal, E.; Romero, V.; Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences. 370:497-518. https://doi.org/10.1016/j.ins.2016.07.063
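    The threshold-based queries described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how frame-level posteriors are turned into one line-level score, so taking the maximum over frames is an assumption here, and the data layout and the names `line_score` and `spot` are hypothetical.

```python
def line_score(frame_posteriors):
    """Collapse per-frame posteriors of one word into a single line-level score.

    Assumption: the maximum over frames is used. Because each input is a
    probability in [0, 1], the result is bounded in [0, 1] as well.
    """
    return max(frame_posteriors, default=0.0)


def spot(posteriorgrams, keyword, threshold):
    """Return (line_id, score) pairs whose score reaches the threshold.

    posteriorgrams maps line_id -> {word: [posterior at frame 0, frame 1, ...]}.
    Results are sorted by decreasing score.
    """
    hits = []
    for line_id, per_word in posteriorgrams.items():
        score = line_score(per_word.get(keyword, []))
        if score >= threshold:
            hits.append((line_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: two lines contain evidence for "house", one does not.
toy = {
    "line-1": {"house": [0.1, 0.9, 0.3]},
    "line-2": {"house": [0.2, 0.4]},
    "line-3": {},
}
strict = spot(toy, "house", threshold=0.5)   # fewer hits, higher precision
lenient = spot(toy, "house", threshold=0.1)  # more hits, higher recall
```

    Raising the threshold keeps only confident hits, and lowering it retrieves more lines, which is the precision/recall trade-off the abstract refers to.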

    Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition

    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    [EN] Existing transcripts of historic manuscripts are a very valuable resource for training models useful for automatic recognition, aided transcription, and/or indexing of the remaining untranscribed parts of these collections. However, these existing transcripts generally exhibit two main problems which hinder their use: a) the text of the transcripts is seldom aligned with manuscript lines, and b) the text often deviates very significantly from what can be seen in the manuscript, either because the writing style has been modernized or abbreviations have been expanded, or both. This work presents an analysis of these problems and discusses possible solutions for minimizing the human effort needed to adapt existing transcripts in order to render them usable. The empirical results presented show the huge performance gain that can be obtained by adequately adapting the transcripts, thus motivating future development of the proposed solutions.
    We are very grateful to Carlos Lechner and Celio Hernández who helped in the creation of the ground truth of the Alcaraz dataset. This work has been partially supported by the European Union (EU) Horizon 2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), EU project HIMANIS (JPICH programme, Spanish grant Ref: PCIN-2015-068) and MINECO/FEDER, UE under project TIN2015-70924-C2-1-R.
    Villegas, M.; Toselli, AH.; Romero Gómez, V.; Vidal, E. (2016). Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition. IEEE. https://doi.org/10.1109/ICFHR.2016.22

    ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset

    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    [EN] This paper describes the Handwritten Text Recognition (HTR) competition on the READ dataset that was held in the context of the International Conference on Frontiers in Handwriting Recognition 2016. This competition aims to bring together researchers working on off-line HTR and provide them with a suitable benchmark to compare their techniques on the task of transcribing typical historical handwritten documents. Two tracks with different conditions on the use of training data were proposed. Ten research groups registered for the competition, but only five ultimately submitted results. The handwritten images for this competition were drawn from the German Ratsprotokolle document collection, composed of minutes of the council meetings held from 1470 to 1805 and used in the READ project. The selected dataset is written by several hands and entails significant variability and difficulty. The five participants achieved good results, with transcription word error rates ranging from 21% to 47% and character error rates ranging from 5% to 19%.
    This work has been partially supported through the European Union's H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), and the MINECO/FEDER UE project TIN2015-70924-C2-1-R.
    Sánchez Peiró, JA.; Romero Gómez, V.; Toselli, AH.; Vidal, E. (2016). ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset. IEEE. https://doi.org/10.1109/ICFHR.2016.0120
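    Word and character error rates such as those reported above are conventionally computed as an edit distance divided by the length of the reference transcript. The sketch below shows that conventional definition only. The competition's exact evaluation protocol (tokenization, case and punctuation handling) is not stated in the abstract, so whitespace tokenization and the function names are assumptions.

```python
def edit_distance(ref, hyp):
    """Minimum number of insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # delete r
                            curr[j - 1] + 1,          # insert h
                            prev[j - 1] + (r != h)))  # substitute or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by reference length; can exceed 1.0."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, assuming whitespace tokenization."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over raw characters, spaces included."""
    return error_rate(list(reference), list(hypothesis))
```

    For instance, one wrong word in a four-word reference gives a WER of 0.25, while the same error may touch only a few characters and therefore yield a much lower CER, which is consistent with the gap between the two ranges quoted in the abstract.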

    Context-aware gestures for mixed-initiative text editing UIs

    This is a pre-copyedited, author-produced PDF of an article accepted for publication in Interacting with computers following peer review. The version of record is available online at: http://dx.doi.org/10.1093/iwc/iwu019[EN] This work is focused on enhancing highly interactive text-editing applications with gestures. Concretely, we study Computer Assisted Transcription of Text Images (CATTI), a handwriting transcription system that follows a corrective feedback paradigm, where both the user and the system collaborate efficiently to produce a high-quality text transcription. CATTI-like applications demand fast and accurate gesture recognition, for which we observed that current gesture recognizers are not adequate enough. In response to this need we developed MinGestures, a parametric context-aware gesture recognizer. Our contributions include a number of stroke features for disambiguating copy-mark gestures from handwritten text, plus the integration of these gestures in a CATTI application. It becomes finally possible to create highly interactive stroke-based text-editing interfaces, without worrying to verify the user intent on-screen. We performed a formal evaluation with 22 e-pen users and 32 mouse users using a gesture vocabulary of 10 symbols. MinGestures achieved an outstanding accuracy (<1% error rate) with very high performance (<1 ms of recognition time). We then integrated MinGestures in a CATTI prototype and tested the performance of the interactive handwriting system when it is driven by gestures. Our results show that using gestures in interactive handwriting applications is both advantageous and convenient when gestures are simple but context-aware. 
    Taken together, this work suggests that text-editing interfaces not only can be easily augmented with simple gestures, but also may substantially improve user productivity.
    This work has been supported by the European Commission through the 7th Framework Program (tranScriptorium: FP7-ICT-2011-9, project 600707 and CasMaCat: FP7-ICT-2011-7, project 287576). It has also been supported by the Spanish MINECO under grant TIN2012-37475-C02-01 (STraDa), and the Generalitat Valenciana under grant ISIC/2012/004 (AMIIS).
    Leiva, LA.; Alabau, V.; Romero Gómez, V.; Toselli, AH.; Vidal, E. (2015). Context-aware gestures for mixed-initiative text editing UIs. Interacting with Computers. 27(6):675-696. https://doi.org/10.1093/iwc/iwu019

    A Set of Benchmarks for Handwritten Text Recognition on Historical Documents

    [EN] Handwritten Text Recognition is an important requirement for making visible the contents of the myriad historical documents residing in public and private archives and libraries worldwide. Automatic Handwritten Text Recognition (HTR) is a challenging problem that requires a careful combination of several advanced Pattern Recognition techniques, including but not limited to Image Processing, Document Image Analysis, Feature Extraction, Neural Network approaches and Language Modeling. The progress of this kind of system is strongly bound by the availability of adequate benchmarking datasets, software tools and reproducible results achieved using the corresponding tools and datasets. Based on English and German historical documents proposed in recent open competitions at ICDAR and ICFHR conferences between 2014 and 2017, this paper introduces four HTR benchmarks in order of increasing complexity from several points of view. For each benchmark, a specific system is proposed which improves on the results published so far under comparable conditions. This paper therefore establishes new state-of-the-art baseline systems and results, which are intended as new challenges that will hopefully drive further improvement of HTR technologies. Both the datasets and the software tools used to implement the baseline systems are made freely accessible for research purposes. (C) 2019 Elsevier Ltd. All rights reserved.
    This work has been partially supported through the European Union's H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), as well as by the BBVA Foundation through the 2017-2018 and 2018-2019 Digital Humanities research grants "Carabela" and "HisClima - Dos Siglos de Datos Climaticos", and by EU JPICH project "HOME - History Of Medieval Europe" (Spanish PEICTI Ref. PCI2018-093122).
    Sánchez Peiró, JA.; Romero, V.; Toselli, AH.; Villegas, M.; Vidal, E. (2019). A Set of Benchmarks for Handwritten Text Recognition on Historical Documents. Pattern Recognition. 94:122-134. https://doi.org/10.1016/j.patcog.2019.05.025

    ICFHR2016 Handwritten Keyword Spotting Competition (H-KWS 2016)

    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    [EN] The H-KWS 2016 competition, organized in the context of the ICFHR 2016 conference, aims at setting up an evaluation framework for benchmarking handwritten keyword spotting (KWS), examining both the Query by Example (QbE) and the Query by String (QbS) approaches. The two KWS approaches were hosted in two different tracks, which in turn were split into two distinct challenges, namely a segmentation-based one and a segmentation-free one, to accommodate the different perspectives adopted by researchers in the KWS field. In addition, the competition aims to evaluate the submitted training-based methods under different amounts of training data. Four participants submitted at least one solution to one of the challenges, according to the capabilities and/or restrictions of their systems. The data used in the competition consisted of historical German and English documents with their own characteristics and complexities. This paper presents the details of the competition, including the data, the evaluation metrics and the results of the best run of each participating method.
    This work was partially supported by the Spanish MEC under FPU grant FPU13/06281, by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, grant Ref. 674943).
    Pratikakis, I.; Zagoris, K.; Gatos, B.; Puigcerver, J.; Toselli, AH.; Vidal, E. (2016). ICFHR2016 Handwritten Keyword Spotting Competition (H-KWS 2016). IEEE. https://doi.org/10.1109/ICFHR.2016.0117